CstMI Figure 1 - Agarose gel showing CstMI cleavage of
lambda, T7, phiX174, pBR322 and pUC19 DNAs.
CstMI Figure 2 - DNA sequence of the CstMI gene locus (SEQ ID NO:).

1 ATGGTTATGG CCCCTACGAC TGTTTTTGAC CGCGCTACCA TTCGCCACAA
51 TCTCACCGAA TTCAAACTCC GGTGGCTTGA CCGCATTAAG CAATGGGAGG
101 CGGAAAACCG ACCCGCAACC GAGTCGAGTC ACGACCAACA GTTCTGGGGT
151 GACCTGCTCG ACTGCTTCGG TGTCAACGCC CGCGACCTGT ACTTGTACCA
201 ACGCAGCGCT AAACGCGCTT CGACGGGGCG CACCGGCAAG ATCGACATGT
251 TTATGCCGGG CAAAGTCATA GGCGAGGCTA AGTCCCTCGG CGTCCCGCTC
301 GATGATGCTT ATGCCCAAGC TTTGGATTAT TTGCTGGGCG GTACTATCGC
351 GAACTCGCAC ATGCCGGCCT ATGTTGTCTG CTCCAACTTC GAGACCCTGC
401 GGGTTACCCG TCTTAACCGC ACCTATGTCG GCGATAGCGC CGACTGGGAC
451 ATTACATTCC CTTTAGCTGA GATTGACGAG CACATCGAAC AACTCGCTTT
501 TCTCGCCGAC TATGAAACCT CCGCCTACCG GGAGGAAGAA AAGGCTTCCC
551 TGGAAGCCTC TCGGTTAATG GTGGAGCTCT TCCGCGCCAT GAACGGCGAC
601 GACGTGGACG AGGCAGTAGG CGATGACGCT CCCACCACGC CGGAGGAAGA
651 AGACGAGCGC GTCATGCGCA CCTCTATCTA CCTCACCCGA ATCCTCTTCC
701 TTCTCTTCGG CGACGACGCA GGACTCTGGG ATACCCCGCA TTTGTTTGCG
751 GACTTTGTGC GCAATGAAAC CACCCCAGAA TCGCTCGGCC CGCAGCTCAA
801 TGAGCTATTT AGCGTGCTTA ATACCGCCCC GGAAAAGCGG CCTAAGCGTT
851 TGCCATCAAC GTTGGCGAAG TTTCCTTATG TCAATGGTGC CCTATTTGCT
901 GAACCGTTGG CCTCGGAGTA CTTCGACTAC CAGATGCGCG AAGCATTGCT
951 TGCTGCCTGC GACTTCGACT GGTCGACCAT TGACGTCTCC GTCTTTGGTT
1001 CGTTGTTCCA ATTGGTGAAA TCGAAGGAAG CGCGCCGCAG CGACGGCGAA
1051 CACTACACGT CTAAGGCCAA CATCATGAAG ACCATCGGCC CGCTGTTTTT
1101 GGACGAGCTG AGGGCTGAGG CCGATAAGTT GGTGTCTTCT CCGTCGACGT
1151 CGGTGGCCGC ATTAGAGCGC TTCCGCGACT CCCTGTCTGA GCTGGTATTC
1201 GCTGATATGG CTTGTGGTTC TGGAAACTTC CTGCTTCTGG CGTATCGGGA
1251 GTTGCGCCGG ATTGAAACCG ACATCATTGT CGCTATACGC CAGCGCCGCG
1301 GTGAAACGGG CATGTCGTTG AATATTGAGT GGGAGCAGAA ACTGTCCATT
1351 GGGCAGTTCT ACGGCATTGA GCTGAATTGG TGGCCTGCCA AGATTGCTGA
1401 GACTGCCATG TTCCTAGTTG ACCATCAGGC CAACAAGGAG CTTGCCAACG
1451 CTGTGGGTAG GCCTCCGGAG CGGTTGCCGA TTAAGATTAC CGCGCACATT
1501 GTGCACGGCA ATGCCCTGCA GCTTGATTGG GCAGACATAC TCTCGGCTTC
1551 TGCCGCCAAG ACGTATATCT TCGGTAACCC GCCGTTTTTG GGGCATGCGA
1601 CGAGAACTGC TGAACAAGCT CAAGAACTCC GAGACTTGTG GGGCACTAAG
1651 GACATTTCAC GCTTGGACTA CGTCACCGGC TGGCATGCAA AGTGCTTGGA
1701 TTTCTTTAAG TCCCGAGAGG GTCGTTTTGC GTTTGTCACC ACCAATTCAA
1751 TTACTCAAGG TGATCAAGTT CCACGGCTAT TTGGGCCTAT CTTCAAAGCA
1801 GGGTGGCGTA TTCGTTTCGC TCACCGCACG TTTGCGTGGG ACTCTGAAGC
1851 ACCCGGTAAA GCTGCTGTTC ACTGCGTCAT TGTTGGCTTC GATAAGGAGA
1901 GTCAACCACG TCCACGTCTG TGGGATTATC CCGATGTAAA GGGCGAGCCA
1951 GTCTCAGTGG AAGTAGGCCA GTCCATTAAT GCCTATTTAG TAGACGGCCC
2001 TAATGTTCTT GTCGATAAAT CCCGGCATCC TATTTCGTCG GAAATATCGC
2051 CCGCAACTTT TGGAAATATG GCGCGAGATG GCGGCAACCT TCTAGTTGAG
2101 GTCGACGAAT ACGACGAGGT TATGAGTGAC CCCGTAGCGG CAAAGTATGT
2151 TCGCCCTTTC CGGGGTAGTC GAGAGCTAAT GAACGGCTTA GATCGGTGGT
2201 GTCTATGGCT TGTAGATGTA GCACCGTCAG ACATTGCCCA GAGTCCGGTT
2251 CTGAAAAAGC GTCTAGAAGC GGTTAAGTCT TTTCGAGCCG ACAGTAAAGC
2301 GGCAAGTACA CGGAAAATGG CTGAAACTCC GCACTTATTC GGCCAGCGGT
2351 CGCAACCGGA TACTGATTAC CTTTGCCTGC CGAAGGTAGT AAGCGAACGC
2401 CGCTCGTATT TCACCGTACA AAGGTATCCA TCAAACGTAA TCGCTTCTGA
2451 CCTAGTATTC CATGCTCAAG ATCCAGACGG CCTGATGTTT GCGCTAGCGT
2501 CGTCGTCGAT GTTCATTACG TGGCAGAAAA GCATCGGAGG ACGACTCAAG
2551 TCTGATCTCC GTTTTGCTAA CACTTTGACG TGGAATACTT TCCCAGTGCC
2601 AGAACTCGAC GAGAAGACGC GGCAGCGAAT TATTAAAGCG GGCAAGAAGG
2651 TGCTCGACGC CCGCGCGCTG CACCCAGAAC GCTCGCTGGC CGAGCACTAC
2701 AACCCACTCG CGATGGCACC GGAACTCATC AAAGCGCATG ATGCGCTCGA
2751 CCGCGAGGTG GATAAAGCGT TTGGCGCGCC ACGAAAGCTG ACAACTGTTC
2801 GGCAGCGCCA GGAGCTATTG TTTGCCAATT ACGAAAAACT CATCTCACAC
2851 CAGCCCTAG
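
The reading frame of the Figure 2 sequence can be checked against the amino acid sequence in Figure 3 by translating it. The following is a minimal Python sketch, not part of the source: it assumes the standard genetic code and translation starting at the first base, and the helper names `clean` and `translate` are illustrative only.

```python
from itertools import product

# Standard genetic code, written in TCAG order (assumed; the source does
# not name a translation table).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def clean(listing: str) -> str:
    """Keep only A/C/G/T, dropping position numbers and spacing."""
    return "".join(ch for ch in listing.upper() if ch in "ACGT")


def translate(listing: str) -> str:
    """Translate in frame 1 and stop at the first stop codon."""
    dna = clean(listing)
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# First 20 bases and last 9 bases of the Figure 2 listing.
print(translate("1 ATGGTTATGG CCCCTACGAC"))  # MVMAPT
print(translate("2851 CAGCCCTAG"))           # QP
```

Both results agree with the start (MVMAPT) and end (QP) of the Figure 3 sequence; passing the full listing to `translate` would extend the same check to every row.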



CstMI Figure 3: Amino acid sequence of the CstMI gene locus (SEQ ID
NO:).

1 MVMAPTTVFD RATIRHNLTE FKLRWLDRIK QWEAENRPAT ESSHDQQFWG
51 DLLDCFGVNA RDLYLYQRSA KRASTGRTGK IDMFMPGKVI GEAKSLGVPL
101 DDAYAQALDY LLGGTIANSH MPAYVVCSNF ETLRVTRLNR TYVGDSADWD
151 ITFPLAEIDE HIEQLAFLAD YETSAYREEE KASLEASRLM VELFRAMNGD
201 DVDEAVGDDA PTTPEEEDER VMRTSIYLTR ILFLLFGDDA GLWDTPHLFA
251 DFVRNETTPE SLGPQLNELF SVLNTAPEKR PKRLPSTLAK FPYVNGALFA
301 EPLASEYFDY QMREALLAAC DFDWSTIDVS VFGSLFQLVK SKEARRSDGE
351 HYTSKANIMK TIGPLFLDEL RAEADKLVSS PSTSVAALER FRDSLSELVF
401 ADMACGSGNF LLLAYRELRR IETDIIVAIR QRRGETGMSL NIEWEQKLSI
451 GQFYGIELNW WPAKIAETAM FLVDHQANKE LANAVGRPPE RLPIKITAHI
501 VHGNALQLDW ADILSASAAK TYIFGNPPFL GHATRTAEQA QELRDLWGTK
551 DISRLDYVTG WHAKCLDFFK SREGRFAFVT TNSITQGDQV PRLFGPIFKA
601 GWRIRFAHRT FAWDSEAPGK AAVHCVIVGF DKESQPRPRL WDYPDVKGEP
651 VSVEVGQSIN AYLVDGPNVL VDKSRHPISS EISPATFGNM ARDGGNLLVE
701 VDEYDEVMSD PVAAKYVRPF RGSRELMNGL DRWCLWLVDV APSDIAQSPV
751 LKKRLEAVKS FRADSKAAST RKMAETPHLF GQRSQPDTDY LCLPKVVSER
801 RSYFTVQRYP SNVIASDLVF HAQDPDGLMF ALASSSMFIT WQKSIGGRLK
851 SDLRFANTLT WNTFPVPELD EKTRQRIIKA GKKVLDARAL HPERSLAEHY
901 NPLAMAPELI KAHDALDREV DKAFGAPRKL TTVRQRQELL FANYEKLISH
951 QP



Figure 4 - Agarose gel showing CstMI protection of 
pTBCstMI.3 DNA and cleavage of unmodified DNA substrate. 




CstMI Figure 5: Determination of the CstMI cleavage
site.

Figure 5A: Location of cleavage on 5'-AAGGAG-3' strand.

pUC19-Adeno2BC4 DNA was cut with CstMI producing 
ends as indicated by the arrows: 

5'-..cgaacccaggtgtgcgacg↓tcagacaacgggggagcgCTCCTTttg..-3'
(SEQ ID NO:3)
3'-..gcttgggtccacacgct↑gcagtctgttgccccctcgcGAGGAAaac..-5'

The resulting cleaved DNA: 

5'-..CGAACCCAGGTGTGCGACG-3' (SEQ ID NO:4)
3'-..GCTTGGGTCCACACGCT-5'

The template strand for dideoxy DNA sequencing extension:
3'-..GGGTCCACACGCT-5'

The primer (NEB1224) is annealed and extended through the 
CstMI site. When the reaction reaches the end of the 
molecule the Taq polymerase adds an extra A base. 

5'-PRIMER->..CGAACCCAGGTGTGCGA(A)-3' (SEQ ID NO:5)
3'-GCTTGGGTCCACACGCT-(N20-GAGGAA)-5'

Sequencing Profile of CstMI-cut pUC19-Adeno2BC4 DNA
(ABI377 Sequencer)

CGAACCCAGG TGTGCGANGTCAG ACAACGGGGG AGCGCTCCTTTTGGCTTCCTTCCAGGCG
110 120 130 140 150 160




Figure 5B: Location of cleavage on 5'-CTCCTT-3' strand.

pBR322 DNA was cut with CstMI, yielding ends indicated by
the arrows:

5'-..TGCATGCAAGGAGATGGCGCCCAACAGTCCCCC↓GGCCACGGGGCC..-3'
(SEQ ID NO:6)
3'-..acgtacgTTCCTCtaccgcgggttgtcaggg↑ggccggtgccccgg..-5'

The resulting cleaved DNA: 

5'-..tgcatgcAAGGAGatggcgcccaacagtccccc-3'
(SEQ ID NO:7)
3'-..acgtacgTTCCTCtaccgcgggttgtcaggg-5'

The template strand for dideoxy DNA sequencing extension:
3'-..acgtacgTTCCTCtaccgcgggttgtcaggg-5'



The primer (NEB1242) is annealed and extended through the 
CstMI site. When the reaction reaches the end of the 
molecule the Taq polymerase adds an extra A base. 

5'-PRIMER->..TGCATGCAAGGAGATGGCGCCCAACAGTCCC(A)-3'
(SEQ ID NO:8)

3'-ACGTACGTTCCTCTACCGCGGGTTGTCAGGG-5'



Sequencing Profile of CstMI-cut pBR322 DNA (ABI377 Sequencer)

TGCATGCAAGG AGATGGCGC CCAACAG TCCCCNGGCCACGGGG CC
140 150 160 170
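
Counting bases in Figure 5B places the top-strand cut 20 nucleotides 3' of AAGGAG and the bottom-strand cut 18 nucleotides past the site, which leaves a 2-base 3' overhang. The Python sketch below reproduces the left-hand cleavage product from the pBR322 fragment shown above. The 20/18 offsets are read off the figure by counting rather than stated in the caption, and the function names are illustrative assumptions.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def cut_positions(top: str, site: str = "AAGGAG",
                  top_offset: int = 20, bottom_offset: int = 18):
    """Return (top_cut, bottom_cut) as indices into the top strand.

    Assumes the site is read 5'->3' on the given strand and that both
    cuts fall downstream (3') of it, as counted from Figure 5B.
    """
    start = top.upper().find(site)
    if start < 0:
        raise ValueError("recognition site not found")
    end = start + len(site)
    return end + top_offset, end + bottom_offset


# Top strand of the pBR322 fragment in Figure 5B, without the arrow.
top = "TGCATGCAAGGAGATGGCGCCCAACAGTCCCCCGGCCACGGGGCC"
top_cut, bottom_cut = cut_positions(top)

left_top = top[:top_cut]                               # written 5'->3'
left_bottom = top[:bottom_cut].translate(COMPLEMENT)   # written 3'->5'

print(left_top)     # TGCATGCAAGGAGATGGCGCCCAACAGTCCCCC
print(left_bottom)  # ACGTACGTTCCTCTACCGCGGGTTGTCAGGG
```

For a substrate such as the one in Figure 5A, where the top strand carries CTCCTT, the same function can be applied to `reverse_complement(top)` so that the site is again read as AAGGAG.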




CstMI Figure 6: Sequence alignment of CstMI and MmeI amino acid
sequences.



Symbol comparison table: /gcg/bin/gcgcore/data/rundata/blosum62.cmp
CompCheck: 1102

        Gap Weight: 8            Average Match: 2.778
     Length Weight: 2         Average Mismatch: -2.248
           Quality: 1548                Length: 942
             Ratio: 1.718                 Gaps: 19
Percent Similarity: 51.009    Percent Identity: 39.574

Match display thresholds for the alignment(s):
    | = IDENTITY
    : = 2
    . = 1

CstMI.pep x MmeI.pep    June 20, 2003 11:45

20 EFKLRWLDRIKQWEAENRPATESSHDQQFWGDLLDCFGVNARDLYLYQRS 69
| : : :: |.|| . | . | . | | : | | : : . : : . 

7 EIRRKAIEFSKRWE...DASDENSQAKPFLIDFFEVFGITNKRVATFEHA 53

70 AKRASTGR....TGKIDMFMPGKVIGEAKSLGVPLDDAYAQALDYLLGGT 115

h . I :|:| II -INI II II Mill I 

54 VKKFAKAHKEQSRGFVDLFWPGILLIEMKSRGKDLDKAYDQALDYFSG.. 101

116 IANSHMPAYVVCSNFETLRVTRLNRTYVGDSADWDITFPLAEIDEHIEQL 165

II :| II- -h Ml H : I I - = 

102 IAERDLPRYVLVCDFQRFRLTDL...ITKES....VEFLLKDLYQNVRSF 144

166 AFLADYETSAYREEEKASLEASRLMVELFRAMNGDDVDEAVGDDAPTTPE 215

|:| |:| : :: .:.|. | .| | . || : 

145 GFIAGYQTQVIKPQDPINIKAAERMGKL......HDTLKLVGYEGHA... 185

216 EEDERVMRTSIYLTRILFLLFGDDAGLWDTPHLFADFVRNETTPE..SLG 263

:|| |:|| || :| :.: || ::: .| : | 
186 ........LELYLVRLLFCLFAEDTTIFE.KSLFQEYIETKTLEDGSDLA 226

264 PQLNELFSVLNTAPEKRPKRLPSTLAKFPYVNGALFAEPLASEYFDYQMR 313 

:| II I I I I HI I I II llhll II III II II 

227 HHINTLFYVLNTPEQKRLKNLDEHLAAFPYINGKLFEEPLPPAQFDKAMR 276

314 EALLAACDFDWSTIDVSVFGSLFQLVKSKEARRSDGEHYTSKANIMKTIG 363

I I I I I III I -Mill : • I I - I I I I I - I I I : I I 
277 EALLDLCSLDWSRISPAIFGSLFQSIMDAKKRRNLGAHYTSEANILKLIK 326 

364 PLFLDELRAEADKLVSSPSTSVAALERFRDSLSELVFADMACGSGNFLLL 413

lllllll I • I I I I II I III IMM: 

327 PLFLDELWVEFEKVKNNKNKLLA....FHKKLRGLTFFDPACGCGNFLVI 372

414 AYRELRRIETDIIVAIRQRRGETGMSLNIEWEQKLSIGQFYGIELNWWPA 463 

HIM :| ::: : | |: |.|| .:.: ||:|||: .|| 

373 TYRELRLLEIEVLRGL.HRGGQ..QVLDIEHLIQINVDQFFGIEIEEFPA 419

464 KIAETAMFLVDHQANKELANAVGRPPERLPIKITAHIVHGNALQLDWADI 513 

MM MM III I I IMM I IM. IIIMM M 

420 QIAQVALWLTDHQMNMKISDEFGNYFARIPLKSTPHILNANALQIDWNDV 469

514 LSASAAKTYIFGNPPFLGHATRTAEQAQELRDLWGT.KDISRLDYVTGWH 562

I I : I IIIMM • M I M I IIIMM 



470 LEAKKC.CFILGNPPFVGKSKQTPGQKADLLSVFGNLKSASDLDLVAAWY 518



563 AKCLDFFKSREG.RFAFVTTNSITQGDQVPRLFGPIFKAGWRIRFAHRTF 611

I : •• I IMMIIIMIMI |. : | :| I I I I I I 

519 PKAAHYIQTNANIRCAFVSTNSITQGEQVSLLWPLLLSLGIKINFAHRTF 568 

612 AWDSEAPGKAAVHCVIVGFDKESQPRPRLWDYPDVKGEPVSVEVGQSINA 661 

•I -II I MINIMI . :::| : |||..:- .-II 

569 SWTNEASGVAAVHCVIIGFGLKDSDEKIIYEYESINGEPLAIK.AKNINP 617

662 YLVDGPNVLVDKSRHPISSEISPATFGNMARDGGNLLVEVDEYDE.VMSD 710 

II II -h I • II I-: Ml I II I :| : .= 

618 YLRDGVDVIACKRQQPI.SKLPSMRYGNKPTDDGNFLFTDEEKNQFITNE 666

711 PVAAKYVRPFRGSRELMNGLDRWCLWLVDVAPSDIAQSPVLKKRLEAVKS 760 

i . ii i i i i -i Mini hi •• |:. 

667 PSSEKYFRRFVGGDEFINNTSRWCLWLDGADISEIRAMPLVLARIKKVQE 716 

761 FRADSKAASTRKMAETPHLFGQRSQPDTDYLCLPKVVSERRSYFTVQRYP 810

II I I II- I II I MINIM M. Ill:: 

717 FRLKSSAKPTRQSASTPMKFFYISQPDTDYLLIPETSSENRQFIPIGFVD 766

811 SNVIASDLVFHAQDPDGLMFALASSSMFITWQKSIGGRLKSDLRFANTLT 860

MM. :| : U I MM I :-MMIM |:. - I 

767 RNVISSNATYHIPSAEPLIFGLLSSTMHNCWMRNVGGRLESRYRYSASLV 816 

861 WNTFPVPELDEKTRQRIIKAGKKVLDARALHPERSLAEHYNPLAMAPELI 910 

MMI : Ml I • I M II- M III IM I Ih 

817 YNTFPWIQPNEKQSKAIEEAAFAILKARSNYPNESLAGLYDPKTMPSELL 866 

911 KAHDALDREVDKAFGAPRKLTTVRQRQELLFANYEKLISHQP 952 

III Ih MM I : I II hh I I 

867 KAHQKLDKAVDSVYGFKGPNTEI.ARIAFLFETYQKMTSLLP 907
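
The Percent Identity figure in the alignment header can be approximated directly from the aligned rows. The Python sketch below counts identical residues over columns where neither sequence has a gap. This denominator is an assumption about how the alignment program scores identity, and the function name is illustrative rather than taken from the source.

```python
def percent_identity(a: str, b: str, gap: str = ".") -> float:
    """Percent of identical residues over columns with no gap in either row.

    Assumes both rows come from the same alignment block and therefore
    have equal length, with gaps written as '.'.
    """
    if len(a) != len(b):
        raise ValueError("aligned rows must have equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not pairs:
        return 0.0
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


# One gap-free block from Figure 6 (CstMI 264-313 against MmeI 227-276).
cstmi_block = "PQLNELFSVLNTAPEKRPKRLPSTLAKFPYVNGALFAEPLASEYFDYQMR"
mmei_block = "HHINTLFYVLNTPEQKRLKNLDEHLAAFPYINGKLFEEPLPPAQFDKAMR"
print(round(percent_identity(cstmi_block, mmei_block), 1))
```

Concatenating all blocks of each sequence before calling the function would give a whole-alignment value to compare with the header.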




FIGURE 7 




